AITopics

2605.19052

Country:

Europe (0.46)
Asia (0.28)

Genre: Research Report (0.64)

Industry: Transportation (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)

Allerbo, Oskar, Schön, Thomas B.

A Rigorous, Tractable Measure of Model Complexity

arXiv.org Machine Learning, May-21-2026

One of the most fundamental properties of a machine learning model is its complexity, with applications across topics such as interpretation, generalization, and model selection. Despite its importance, there is no canonical, model-agnostic way to assess a model's complexity. While simple heuristics, such as the number or magnitude of parameters, yield very crude estimates, hyperparameter-based approaches, such as polynomial degree or kernel length scale, do not generalize across model classes. More rigorous methods, including the Vapnik-Chervonenkis dimension (VCD) (Vapnik, 2013), Rademacher complexity (RMC) (Bartlett and Mendelson, 2002), and effective number of parameters (or effective degrees of freedom, ENP) (Efron, 1986), are difficult, or even impossible, to compute in practice, leaving the user to resort to crude bounds and/or approximations. The topic is further complicated by the often overlooked distinction between model and function complexity, where the former sets a ceiling on the latter.
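As a small illustration of why such measures are usually only approximated, the sketch below estimates the empirical Rademacher complexity by Monte Carlo for one of the few classes where the inner supremum has a closed form: linear predictors with a bounded Euclidean norm. The function name `empirical_rademacher_linear`, the norm bound, and the synthetic data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def empirical_rademacher_linear(X, norm_bound, n_draws=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity of
    the class {x -> w.x : ||w||_2 <= norm_bound} on the sample X."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Each row is one draw of i.i.d. Rademacher signs (+1 or -1).
    sigma = rng.choice([-1.0, 1.0], size=(n_draws, n))
    # For this class the supremum over w is attained in closed form:
    # sup_w (1/n) sum_i sigma_i w.x_i = (B/n) * ||sum_i sigma_i x_i||_2
    sups = norm_bound / n * np.linalg.norm(sigma @ X, axis=1)
    return float(sups.mean())

# Synthetic sample, assumed purely for illustration.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))

estimate = empirical_rademacher_linear(X, norm_bound=1.0)
# Standard upper bound via Jensen's inequality: B * sqrt(sum_i ||x_i||^2) / n
upper_bound = 1.0 * np.sqrt((X ** 2).sum()) / X.shape[0]
print(estimate, upper_bound)
```

For most model classes no such closed form for the supremum exists, which is where the bounds and approximations mentioned in the abstract come in.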

artificial intelligence, complexity, machine learning

2605.21167

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Hashimoto, Yuka, Iwata, Tomoharu

Unified generalization analysis for physics informed neural networks

arXiv.org Machine Learning, May-14-2026

Physics-Informed Neural Networks (PINNs) and their variational counterparts (VPINNs) are neural networks that incorporate physical laws, making them useful for scientific problems. Existing generalization analyses for PINNs and VPINNs remain limited, often requiring restrictive assumptions such as stability conditions or linear ellipticity. In this paper, we derive generalization bounds for neural networks that involve differentiation with respect to input variables, covering PINNs and VPINNs under a unified framework. We apply Taylor expansion to represent nonlinear differential operators as linear operators on a high-dimensional space, enabling the use of Koopman-based analysis and showing that high-rank networks can generalize well even in settings involving differential operators. We also show that the nonlinearity of the differential operator exponentially enlarges the bound, highlighting its significant impact on generalization.

artificial intelligence, machine learning, neural network

2605.1326

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Taniguchi, Koichi, Sonoda, Sho

Generalization Error Bounds for Picard-Type Operator Learning in Nonlinear Parabolic PDEs

arXiv.org Machine Learning, May-12-2026

Operator learning for partial differential equations (PDEs) aims to learn solution operators on infinite-dimensional function spaces from finite-resolution data. In this setting, it is important for the learned model to be discretization-invariant, or resolution-robust, and to reflect PDE-specific structure. It is therefore natural to ask how such structure should be encoded in the model architecture, hypothesis class, or learning procedure. In this paper, we study operator learning for solution operators of nonlinear parabolic PDEs based on Duhamel--Picard iteration. We formulate Picard iteration as an abstract state-transition model and present a theoretical framework for Picard-type operator learning. We derive implementation-agnostic generalization error bounds that separate the implementation error from the estimation error associated with the abstract state-transition model induced by Picard iteration. A key consequence is that increasing the Picard depth reduces the Picard truncation error without causing an unbounded growth of the entropy-based estimation error. We also extend the analysis to long-time prediction by rolling out the same learned local model over successive time blocks. Finally, we illustrate the theory for nonlinear heat equations on the torus using a Picard-type Fourier neural operator as a concrete implementation.
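For orientation, the Duhamel--Picard iteration mentioned above can be written out for a generic semilinear parabolic problem. The specific form $\partial_t u = \Delta u + F(u)$ with initial datum $u_0$ is an assumption made here for illustration; the paper's exact setting and notation may differ.

```latex
% Duhamel (mild) formulation of  \partial_t u = \Delta u + F(u),  u(0) = u_0
u(t) = e^{t\Delta} u_0 + \int_0^t e^{(t-s)\Delta} F\bigl(u(s)\bigr)\, ds

% Picard iteration: start from the linear evolution and re-insert the iterate
u^{(0)}(t) = e^{t\Delta} u_0, \qquad
u^{(k+1)}(t) = e^{t\Delta} u_0 + \int_0^t e^{(t-s)\Delta} F\bigl(u^{(k)}(s)\bigr)\, ds,
\quad k = 0, 1, 2, \dots
```

In this reading, the "Picard depth" corresponds to the number of iterations $k$ that are kept, and the truncation error is the gap between $u^{(k)}$ and the fixed point $u$.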

artificial intelligence, machine learning, operator

2605.10277

Country:

Europe (0.92)
Asia (0.67)
North America > United States > New York (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Neural Information Processing Systems, Apr-30-2026, 01:33:24 GMT

Analyzing Generalization of Neural Networks through Loss Path Kernels

Deep neural networks have been increasingly used in real-world applications, making it critical to ensure their ability to adapt to new, unseen data. In this paper, we study the generalization capability of neural networks trained with (stochastic) gradient flow. We establish a new connection between the loss dynamics of gradient flow and general kernel machines by proposing a new kernel, called loss path kernel. This kernel measures the similarity between two data points by evaluating the agreement between loss gradients along the path determined by the gradient flow. Based on this connection, we derive a new generalization upper bound that applies to general neural network architectures. This new bound is tight and strongly correlated with the true generalization error. We apply our results to guide the design of neural architecture search (NAS) and demonstrate favorable performance compared with state-of-the-art NAS algorithms through numerical experiments.
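The kernel described above can be sketched numerically as the time integral of inner products between per-example loss gradients along the training path. The example below is a minimal sketch under stated assumptions, not the authors' implementation: it uses a linear model with squared loss, full-batch training, and a small-step Euler discretization in place of exact gradient flow. The function name `loss_path_kernel` and the synthetic data are hypothetical.

```python
import numpy as np

def loss_path_kernel(X, y, w0, step=1e-3, n_steps=2000):
    """Approximate K[i, j] = integral over t of <grad l(w_t, z_i), grad l(w_t, z_j)>
    for the squared loss l(w, (x, y)) = 0.5 * (w.x - y)^2, with gradient flow
    replaced by small Euler steps (an approximation assumed for this sketch)."""
    n = X.shape[0]
    w = w0.copy()
    K = np.zeros((n, n))
    for _ in range(n_steps):
        # Per-example loss gradients with respect to w, shape (n, d).
        G = (X @ w - y)[:, None] * X
        # Riemann-sum contribution of the gradient inner products.
        K += step * (G @ G.T)
        # Euler step of gradient flow on the mean training loss.
        w -= step * G.mean(axis=0)
    return K, w

# Synthetic regression data, assumed purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)
w0 = np.zeros(3)

K, w_T = loss_path_kernel(X, y, w0)
print(K.shape, np.linalg.eigvalsh(K).min())
```

For exact gradient flow on the mean training loss, the chain rule gives the change in the loss at a point z as minus the average of K(z, z_i) over the training points, which is one way to read the stated connection to kernel machines. The discretized sketch satisfies this only approximately.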

artificial intelligence, generalization, machine learning

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Neural Information Processing Systems, Apr-26-2026, 21:39:47 GMT

Cycle Self-Training for Domain Adaptation

Mainstream approaches for unsupervised domain adaptation (UDA) learn domain-invariant representations to narrow the domain shift; these are empirically effective but theoretically challenged by hardness and impossibility theorems. Recently, self-training has been gaining momentum in UDA. It exploits unlabeled target data by training with target pseudo-labels. However, as corroborated in this work, under distributional shift the pseudo-labels can be unreliable, showing a large discrepancy from the target ground truth. In this paper, we propose Cycle Self-Training (CST), a principled self-training algorithm that explicitly enforces pseudo-labels to generalize across domains.

adaptation, artificial intelligence, machine learning

Country: Asia (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Neural Information Processing Systems, Apr-25-2026, 23:45:37 GMT

Generalization_bound_for_RO-3

artificial intelligence, generalization, machine learning

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Neural Information Processing Systems, Apr-25-2026, 16:55:41 GMT

On Learning Latent Models with Multi-Instance Weak Supervision

We consider a weakly supervised learning scenario where the supervision signal is generated by a transition function σ of labels associated with multiple input instances. We formulate this problem as multi-instance Partial Label Learning (multi-instance PLL). Our problem extends the standard PLL problem and arises in different fields, including latent structural learning and neuro-symbolic integration. Despite the existence of many learning techniques, limited theoretical analysis has been dedicated to this problem. In this paper, we provide the first theoretical study of multi-instance PLL with a possibly unknown transition σ.

artificial intelligence, logic & formal reasoning, machine learning